Data farm: Information system for collecting, storing and processing unstructured data from heterogeneous sources


Abstract

This paper presents an original information system, the «data farm». Today, the successful application of artificial intelligence algorithms, primarily deep learning based on neural networks, depends almost completely on the availability of data. The larger the amount of data (big data), the better the results the algorithms produce. There are well-known examples of such algorithms from Facebook, Google, Microsoft, Yandex, etc. Big data must contain both a training sample and a test sample. Moreover, the data must be of good quality and have a certain structure; ideally, it should be labeled so that the algorithms work adequately. This is a serious problem requiring huge computational and human resources, and the paper is dedicated to solving it. Today the data farm is a rather complex information system built on a modular basis, similar to a Lego construction set. The separate modules of the system are various modern technologies and entire artificial intelligence libraries; together they are designed to automate the process of obtaining and structuring high-quality big data in various subject domains. The data farm has been tested on COVID-19 data for regions of Russia and countries around the world. In addition, a user-friendly interface for visualizing the collected and processed data was developed. It makes it possible to conduct visual and numerical computer simulation experiments and compare them with real data, turning the system into an intelligent decision support system.


Similar resources

A Realization of an Automated Data Flow for Data Collecting, Processing, Storing and Retrieving

GEONET is a database system developed at the Stanford Linear Accelerator Center for the alignment of the Stanford Linear Collider. It features an automated data flow, ranging from data collection using HP110 handheld computers to processing, storing and retrieving data and finally to adjusted coordinates. This paper gives a brief introduction to the SLC project and the applied survey methods. I...


Information Integration for Heterogeneous Data Sources

Information retrieval from heterogeneous information systems is required but challenging, since data is stored and represented in different data models in different information systems. Integrating information from heterogeneous data sources into a single data source faces the major challenge of information transformation, where different formats and constraints in data transformati...


Integrating and Processing Events from Heterogeneous Data Sources

Environmental monitoring studies present many challenges. A huge amount of data is provided in different formats from different sources (e.g. sensor networks and databases). This paper presents a framework we have developed to overcome some of these problems, based on combining aspects of Enterprise Service Bus (ESB) architectures and Event Processing mechanisms. First, we treat integration us...


Creating Relational Data from Unstructured and Ungrammatical Data Sources

In order for agents to act on behalf of users, they will have to retrieve and integrate vast amounts of textual data on the World Wide Web. However, much of the useful data on the Web is neither grammatical nor formally structured, making querying difficult. Examples of these types of data sources are online classifieds like Craigslist and auction item listings like eBay. We call this unstructu...


Ontology-based information extraction and integration from heterogeneous data sources

In this paper we present the design, implementation and evaluation of SOBA, a system for ontology-based information extraction from heterogeneous data resources, including plain text, tables and image captions. SOBA is capable of processing structured information, text and image captions to extract information and integrate it into a coherent knowledge base. To establish coherence, SOBA interli...



Journal

Journal title: Trudy Instituta sistemnogo programmirovaniâ

Year: 2023

ISSN: 2079-8156, 2220-6426

DOI: https://doi.org/10.15514/ispras-2023-35(2)-5